Paraphrase Alignment for Synonym Evidence Discovery
Abstract
We describe a new unsupervised approach to synonymy discovery that aligns paraphrases in monolingual domain corpora. To that end, we identify phrasal terms that convey most of the concepts within domains and adapt a methodology for the automatic extraction and alignment of paraphrases to identify paraphrase casts from which valid synonyms are discovered. Experiments on two different domain corpora show that general synonyms as well as synonymic expressions can be identified with 67.27% precision.
Related resources
Extracting paraphrase patterns from bilingual parallel corpora
Paraphrase patterns are semantically equivalent patterns, which are useful in both paraphrase recognition and generation. This paper presents a pivot approach for extracting paraphrase patterns from bilingual parallel corpora, whereby the paraphrase patterns in English are extracted using the patterns in another language as pivots. We make use of log-linear models for computing the paraphrase l...
Developing Monolingual English Corpus for Plagiarism Detection using Human Annotated Paraphrase Corpus
In this paper, we describe an approach to creating a monolingual English plagiarism detection corpus for the text alignment corpus construction task in the PAN 2015 competition. We propose two different methods of fragment obfuscation for creating plagiarism cases. The first method is an artificial obfuscation that consists of a variety of obfuscation strategies such as synonym sub...
A Case Study Towards Turkish Paraphrase Alignment
Paraphrasing is expressing the same semantic content using different linguistic means. Although previous work has addressed linguistic variations at different levels of language, paraphrasing in Turkish has not yet been thoroughly studied. This paper presents the first study towards Turkish paraphrase alignment. We perform an analysis of different types of paraphrases on a modest Turkish paraph...
Word Alignment with Synonym Regularization
We present a novel framework for word alignment that incorporates synonym knowledge collected from monolingual linguistic resources in a bilingual probabilistic model. Synonym information is helpful for word alignment because we can expect a synonym to correspond to the same word in a different language. We design a generative model for word alignment that uses synonym information as a regulari...
Re-evaluating Machine Translation Results with Paraphrase Support
In this paper, we present ParaEval, an automatic evaluation framework that uses paraphrases to improve the quality of machine translation evaluations. Previous work has focused on fixed n-gram evaluation metrics coupled with lexical identity matching. ParaEval addresses three important issues: support for paraphrase/synonym matching, recall measurement, and correlation with human judgments. We ...